Layer-specific, retinotopically-diffuse modulation in human visual cortex in response to viewing emotionally expressive faces

Viewing faces that are perceived as emotionally expressive evokes enhanced neural responses in multiple brain regions, a phenomenon thought to depend critically on the amygdala. This emotion-related modulation is evident even in primary visual cortex (V1), providing a potential neural substrate by which emotionally salient stimuli can affect perception. How does emotional valence information, computed in the amygdala, reach V1? Here we use high-resolution functional MRI to investigate the layer profile and retinotopic distribution of neural activity specific to emotional facial expressions. Across three experiments, human participants viewed centrally presented face stimuli varying in emotional expression and performed a gender judgment task. We found that facial valence sensitivity was evident only in superficial cortical layers and was not restricted to the retinotopic location of the stimuli, consistent with diffuse feedback-like projections from the amygdala. Together, our results provide a feedback mechanism by which the amygdala directly modulates activity at the earliest stage of visual processing.

The data reported in this study include 43 scan sessions from 25 participants (average age 25.9 years, 12 females, see Table 1), consisting of 14 scan sessions from 14 unique participants at 3T BOLD, 14 scan sessions from 14 unique participants at 7T BOLD, and 15 scan sessions from 10 unique participants at 7T VASO, among whom 5 were scanned twice to evaluate test-retest reliability of VASO. The sample size was chosen based on norms in the field. Our sample size is no smaller than those in many similar human fMRI studies, especially layer fMRI studies.
A total of 53 2-hour scan sessions from 34 healthy right-handed volunteers (age 21-42 years, 16 females) from the DC/MD/VA tri-state area were collected in this series of experiments (7T VASO, 7T BOLD, and 3T BOLD). Based on conservative head motion parameter estimates across different magnetic strength or voxel size, seven 7T VASO scan sessions from six participants were excluded due to excessive head motion (>1 mm translation or >1°rotation within each run and/or >2 mm translation or >2°rotation across runs within a single scan session). Data from an additional 3 participants were further excluded because of technical errors, lack of scan time, or outlier behavioral performance (>3 SD below mean accuracy). Hence, the final dataset reported here includes a total of 43 scan sessions from 25 participants (average age 25.9 years, 12 females, see Table 1), consisting of 14 scan sessions from 14 unique participants at 3T BOLD, 14 scan sessions from 14 unique participants at 7T BOLD, and 15 scan sessions from 10 unique participants at 7T VASO, among whom 5 were scanned twice to evaluate testretest reliability of VASO (see Supplementary Fig. 6). 11 out of 25 participants participated in more than one scan session across three experiments (3T BOLD, 7T BOLD, 7T VASO).
Reporting for specific materials, systems and methods We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 2. Replication of behavioral task performance during the scan Test-retest reliability of task performance showed that behavior was highly consistent across scan sessions on different days within participants ( Supplementary Fig. 8).

Materials
Randomization is not applicable since there were no groups in this study. This study adopts within-subject design.
Blinding is not applicable since there were no groups in this study. This study adopts within-subject design.

Task, block design
In both 3T BOLD and 7T BOLD experiments, each run consisted of three repeats of each facial expression condition (fearful, neutral, happy) and ten repeats of the fixation block. Within each facial expression block, each face was presented for 900 ms with a 100 ms interstimulus interval (ISI) while a green fixation cross remained at the center of the screen at all times. Each fixation block lasted 10 s and each face block lasted 20 s in the 3T BOLD experiment. Hence, each run lasted 4 min 40 s in total. Similarly, each fixation block lasted 9 s and each face block lasted 18 s in the 7T BOLD experiment; thus each run lasted 4 min 12 s in total.
In the 7T VASO experiment, each run consisted of six repeats of each facial expression condition (fearful, neutral, happy) and 19 repeats of the fixation blocks. There were 16 faces presented in each face block. Each face was shown for 1100 ms with a 106.25 ms ISI. Thus, each face block lasted 19.3 s and each fixation block lasted for 9.65 s, and each run lasted 8 min 53 s.
Participants performed a gender judgment task (press "1" for female, "2" for male) for each face stimulus, unrelated to facial expressions. Feedback on task performance (percent correct) and real-time head motion estimates were given to the participant shortly after each run; no feedback was given during scanning.
VASO scan parameters: 7T VASO data were acquired using an inversion recovery prepared 3D-EPI sequence, which was optimized for layerspecific fMRI in human visual cortex. Parameters of inversion recovery preparation were as follows: The adiabatic VASO inversion pulse is based on the TR-FOCI pulse, with a duration of 10 ms and a bandwidth of 6.3 kHz. The inversionefficiency was adjusted by the implementation of a phase skip of 30 deg to minimize the risk of inflow of fresh noninverted blood into the imaging region during the blood nulling time. 7T VASO data were acquired using a 3D-EPI readout with the following parameters: 0.82 x 0.82 x 0.82 mm, FOV read=133mm, 26 slices, whole k-space plane acquired after each shot, FOV in first phase encoding direction = 133.3% of FOV in readout direction, TE=24ms, GRAPPA 3, partial Fourier of 6/8. To account for the T1-decay during the 3D-EPI readout and potential related blurring along the segment direction, a variable flip angle (FA) was applied across segments, which started from 22°, and then exponentially increased until reaching a desired flip angle of 90°.
The acquired time series consisted of interleaved BOLD and VASO images, with TR(BOLD)=2737ms and TR(VASO)=2088 ms, resulting in effective TR(VASO+BOLD)=4825ms. Details of the 7T sequence and scan parameters can be found: https://github.com/tinaliutong/sequence.
Imaging slice position and angle were adjusted individually for each 7T VASO participant so that the slice prescription was parallel to each participant's calcarine sulcus (visualized on the sagittal plane prior to the scan, see Supplementary  Fig. 6a). We also ran the retinotopic atlas analysis based on each participant's T1-weighted MPRAGE MRI, acquired in a separate session prior to the main experimental scan session. This was used to guide slice prescription, aiming to maximally cover V1 in each participant. After slice prescription, a 3rd order B0-shimming was done with four iterations. The shim volume was parallel to the slice prescription.
Image reconstruction was done in the vendor-provided platform (Siemens software identifier: IcePAT WIP 571) and was optimized with the following set-up to minimize image blurring and increase tSNR at high resolution. GRAPPA kernel fitting was done on FLASH autocalibration data with a 3x4 kernel, 48 reference lines and regularization parameter #= 0.1. Partial Fourier reconstruction was done with the projection onto convex sets (POCS) algorithm with eight iterations. Data of each coil channel were combined with the sum-of-squares.
Structural MRI Within the same 3T scan session, anatomical images were acquired in each individual for co-registration purpose using a 3D Magnetization-Prepared Rapid Acquisition Gradient Echo (MPRAGE) sequence with 1 mm isotropic voxels, 176 sagittal slices, acquisition matrix = 256 x 256, TI/TE/TR = 900/1.97/2300 ms, flip angle = 9 " , GRAPPA = 2, scan time = 5 min 21s. The 3T anatomy was also used for co-registration of all 3T BOLD participants and 8 of 14 7T BOLD participants (who participated in both 3T BOLD and 7T BOLD scans). In other 7T participants, a 0.7 mm isotropic resolution T1-maps were collected covering the entire brain using an MP2RAGE sequence with TI1/TI2/TR/TE = 800/2700/6000/3.02 ms, FA1/FA2 = 4°/5°, 224 sagittal slices, matrix size = 320 × 320, scan time = 10 min 8 s. Before the VASO scan, we made sure all participants had prior MPRAGE data available, which was used to estimate the slice angle of the VASO scan.
3T BOLD: Whole-brain coverage 7T BOLD: Up to 2/3 whole-brain coverage with full coverage of the temporal and occipital lobes and partial coverage of the parietal and frontal lobes. 7T VASO: Given that the high-resolution fMRI pulse sequences used in this study currently do not permit a whole-brain field of view, we guided slice prescription to maximally cover V1 in this experiment. Imaging slice position and angle were adjusted individually for each 7T VASO participant so that the slice prescription was parallel to each participant's calcarine sulcus (visualized on the sagittal plane prior to the scan, see Supplementary Fig. 6a). We also ran the retinotopic atlas analysis based on each participant's T1-weighted MPRAGE MRI, acquired in a separate session prior to the main experimental scan session.
All preprocessing steps were implemented in MATLAB 2016b using a combination of mrTools78 and AFNI software package. Standard preprocessing of the 3T multi-echo gradient echo EPI data utilized the AFNI software program afni_proc.py. Data from the first 4 TRs were removed to allow for T1 equilibration and to allow the hemodynamic response to reach steady state. Advanced automatic denoising was achieved using multi-echo EPI imaging and analysis with spatial independent component analysis (ICA), or ME-ICA. Preprocessing of 7T BOLD data included head movement compensation within and across runs, linearly detrended, and high-pass filtered (cutoff: 0.01 Hz) to remove low-frequency noise and drift. For 7T VASO data, all time frames were first split into blood-nulled and blood-not-nulled (BOLD) groups. Motion correction was performed separately for each group. The time frames from each group were upsampled in time via cubic interpolation and the first and last two upsampled time frames in each group were removed from each run. Next, CBV-weighted VASO signals were calculated as blood-nulled divided by blood-not-nulled (BOLD) at each time frame to remove BOLD contamination.
No normalization was applied. All data analysis was conducted in native space.
7T VASO experiment: To ensure the most accurate definition of cortical depths, we used the functional 7T VASO data directly to generate an anatomical reference, termed VASO anatomy. It was computed through dividing the inverse signal variability across blood-nulled and blood-not-nulled images by mean signals. This measure is also called T1-EPI, which provides a good contrast between white matter (WM), gray matter (GM) and cerebral spinal fluid (CSF; see Fig. 3b) in native EPI space.

March 2021
Normalization template Multivariate modeling or predictive analysis 3T and 7T experiment: For visualization purposes, we applied the freesurfer average cortical surface template to visualize the valence effect on the inflated cortical surface at the group level (Figure 1b, Supplementary Figures 1 and 4) freesurfer average cortical surface template was used for visualizing group effects (Figure 1b, Supplementary Figures 1 and 4) 3T BOLD: Data from the first 4 TRs were removed to allow for T1 equilibration and to allow the hemodynamic response to reach steady state. Advanced automatic denoising was achieved using multi-echo EPI imaging and analysis with spatial independent component analysis (ICA), or ME-ICA.
7T BOLD: Data were head movement compensated within and across runs, linearly detrended, and high-pass filtered (cutoff: 0.01 Hz) to remove low-frequency noise and drift.
7T VASO: All time frames were first split into blood-nulled and blood-not-nulled (BOLD) groups. Motion correction was performed separately for each group. The time frames from each group were upsampled in time via cubic interpolation and the first and last two upsampled time frames in each group were removed from each run. Next, CBV-weighted VASO signals were calculated as blood-nulled divided by blood-not-nulled (BOLD) at each time frame to remove BOLD contamination.
No volume censoring was conducted.
Standard general linear model (GLM) analyses were performed. The regressor for each condition of interest (faces, objects, and scrambled objects in the face localizer task, or fearful, neutral, happy in the gender judgement task) was convolved with a canonical hemodynamic response function.
A region-based analysis was performed through Bayesian Multilevel (BML) modeling through the AFNI program RBA (https:// afni.nimh.nih.gov/pub/dist/doc/program_help/RBA.html). The BML modeling results are presented with each region's posterior distribution (Fig. 1c, Supplementary Fig. 4b). Each contrast between two conditions C1 and C2 was expressed as a dimensionless modulation index (C1-C2)/(|C1|+|C2|), whose posterior distribution was represented through the posterior samples drawn from the Markov chain Monte Carlo simulations of the BML model. The strength of statistical evidence is shown through P+, the posterior probability of each region's effect being positive conditioning on the adopted BML model and the current data. See the BML model performance in Supplementary Fig. 4c. See Methods for equations and details.
The correlation coefficients between each pair of ROIs, for fearful and neutral conditions, were computed based on the residual time series (measured response time series -predicted response time series estimated using deconvolution) (Fig. 2a) and their difference in correlation (fearful -neutral) was entered into Wilcoxon signed-rank test (Fig. 2b). The beta weights (in units of percent signal change) and t statistics for the fearful, happy, and neutral conditions were entered into Bayesian Multilevel ( visual areas per hemisphere were defined in these 23 scan sessions (from 15 unique participants). Next, visual areas with the same area label were combined across hemispheres (with IPS1-5, LO1-LO2, PHC1-PHC2, TO1-TO2, V1d-V1v, V2d-V2v, V3A-V3B, V3d-V3v, VO1-VO2 combined) and were further thresholded by R2 value in the independent face localizer (R2>0.1 at both 3T BOLD and 7T BOLD). We functionally defined the amygdala and the FFA using an independent localizer (see Methods).
3T BOLD, 7T BOLD, and 7T VASO: We performed a retinotopic analysis (Fig. 4a, Supplementary Fig. 6a) using a probabilistic atlas (Benson et al., 2014, https://doi.org/10.1371/journal.pcbi.1003538) to segment V1 based on visual eccentricity (deg). The eccentricity map was visualized on a flat patch of early visual cortex and a portion of central V1 corresponding to the size and position of the face stimuli was highlighted by the yellow contour ( Fig. 4a-b).
No voxel-wise inference was drawn in this study except defining functional ROIs using the independent face localizer. The amygdala and the FFA were functionally defined from the independent face localizer using a conjunction between t map of faces-objects (whole brain FDR<0.05) and R2 map (R2>0.1 for 11 scan sessions at 3T BOLD or R2>0.05 for 12 scan session at 7T BOLD). fMRI response from the 13 anatomical ROIs in the visual cortex and functionally defined amygdala and FFA (in Figure 1 and Supplementary Figure 4) were thresholded by R2 value in the independent face localizer (R2>0.1 at both 3T BOLD and 7T BOLD).
Multiple comparison was performed for number of ROIs in Figure 1, for number of cortical depths in Figure 3 and for number of eccentricity bins in Figure 4.